Polarization information of light in a scene is valuable for various image processing and computer vision tasks. A division-of-focal-plane polarimeter is a promising approach for capturing polarization images of different orientations in a single shot, but it requires color-polarization demosaicking. In this paper, we propose a two-step color-polarization demosaicking network (TCPDNet), which consists of two demosaicking steps: color demosaicking and polarization demosaicking. We also introduce a reconstruction loss computed in the YCbCr color space to improve the performance of TCPDNet. Experimental comparisons demonstrate that TCPDNet outperforms existing methods in terms of the image quality of polarization images and the accuracy of Stokes parameters.
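To make the two-step structure and the YCbCr-space reconstruction loss concrete, below is a minimal PyTorch-style sketch. The layer sizes, the channel layout (four polarization angles × RGB), and the BT.601 color conversion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def rgb_to_ycbcr(rgb):
    # ITU-R BT.601 conversion, assuming rgb in [0, 1] with shape (N, 3, H, W)
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return torch.cat([y, cb, cr], dim=1)

class TwoStepDemosaicNet(nn.Module):
    """Hypothetical two-step network: color demosaicking, then polarization demosaicking."""
    def __init__(self):
        super().__init__()
        # Step 1: from the raw mosaic, recover an RGB image for each polarization angle
        self.color_net = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 12, 3, padding=1),   # 4 angles x RGB, angle-major channel order
        )
        # Step 2: refine the four polarization-angle images per color channel
        self.polar_net = nn.Sequential(
            nn.Conv2d(12, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 12, 3, padding=1),
        )

    def forward(self, mosaic):
        rgb_per_angle = self.color_net(mosaic)
        return self.polar_net(rgb_per_angle)

def ycbcr_loss(pred, target):
    # Reconstruction loss computed in the YCbCr color space
    pred_angles = pred.reshape(pred.shape[0] * 4, 3, *pred.shape[2:])
    target_angles = target.reshape(target.shape[0] * 4, 3, *target.shape[2:])
    return nn.functional.l1_loss(rgb_to_ycbcr(pred_angles), rgb_to_ycbcr(target_angles))

net = TwoStepDemosaicNet()
mosaic = torch.rand(2, 1, 64, 64)    # raw color-polarization mosaic
target = torch.rand(2, 12, 64, 64)   # ground-truth 4-angle RGB stack
loss = ycbcr_loss(net(mosaic), target)
```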
Achieving highly accurate kinematic or simulator models that are close to the real robot can facilitate model-based control (e.g., model predictive control or linear-quadratic regulators) and model-based trajectory planning (e.g., trajectory optimization), and can reduce the learning time needed by reinforcement learning methods. The aim of this work is therefore to learn the residual errors between a kinematic and/or simulator model and the real robot. This is achieved using auto-tuning and neural networks, where the parameters of the neural network are updated by an auto-tuning method that applies equations from an Unscented Kalman Filter (UKF) formulation. Using this method, we model these residual errors with only a small amount of data, which is necessary as we learn to improve the simulator/kinematic model directly from hardware operation. We demonstrate our method on robotic hardware (e.g., a manipulator arm) and show that, with the learned residual errors, we can further close the reality gap between kinematic models, simulations, and the real robot.
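As a rough illustration of updating network parameters with unscented-Kalman-filter equations, here is a minimal numpy sketch that treats the weights of a tiny residual network as the filter state and one observed residual as a scalar measurement. The network architecture, noise covariances, and sigma-point weighting are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def nn_residual(weights, x, n_hidden=8):
    """Tiny one-hidden-layer network mapping a state/command x to a predicted residual."""
    n_in = x.shape[0]
    W1 = weights[:n_in * n_hidden].reshape(n_hidden, n_in)
    b1 = weights[n_in * n_hidden:n_in * n_hidden + n_hidden]
    W2 = weights[n_in * n_hidden + n_hidden:-1].reshape(1, n_hidden)
    b2 = weights[-1]
    return (W2 @ np.tanh(W1 @ x + b1) + b2).item()

def ukf_weight_update(w, P, x, observed_residual, Q=1e-6, R=1e-3, kappa=0.0):
    """One unscented-Kalman-filter step treating the network weights as the state."""
    n = w.size
    P = P + Q * np.eye(n)                              # process noise on the weights
    S = np.linalg.cholesky((n + kappa) * P)
    sigmas = np.concatenate([[w], w + S.T, w - S.T])   # 2n + 1 sigma points
    Wm = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    Wm[0] = kappa / (n + kappa)
    preds = np.array([nn_residual(s, x) for s in sigmas])  # predicted measurements
    z_hat = Wm @ preds
    Pzz = Wm @ (preds - z_hat) ** 2 + R                # innovation variance
    Pwz = (Wm * (preds - z_hat)) @ (sigmas - w)        # weight/measurement cross-covariance
    K = Pwz / Pzz                                      # Kalman gain (vector)
    w_new = w + K * (observed_residual - z_hat)
    P_new = P - np.outer(K, K) * Pzz
    return w_new, P_new

# Example: one update for a 2-input, 8-hidden-unit residual net
n_w = 2 * 8 + 8 + 8 + 1
w, P = np.zeros(n_w), 0.1 * np.eye(n_w)
w, P = ukf_weight_update(w, P, x=np.array([0.5, -0.2]), observed_residual=0.03)
```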
Although motion planning for legged robots has shown great success, motion planning for legged robots with dexterous multi-finger grasping is not yet mature. We present an efficient motion planning framework that simultaneously solves the locomotion (e.g., centroidal dynamics), grasping (e.g., patch contact), and contact (e.g., gait) problems. To accelerate the planning process, we propose a distributed optimization framework based on the Alternating Direction Method of Multipliers (ADMM) to solve the original large-scale mixed-integer nonlinear programming (MINLP) problem. The resulting framework uses mixed-integer quadratic programming (MIQP) to solve for contacts and nonlinear programming (NLP) to solve for the nonlinear dynamics, which are more computationally tractable and less sensitive to parameters. In addition, we explicitly enforce patch contact constraints from limit surfaces with our micro-spine grippers. We demonstrate our proposed framework in hardware experiments, showing that the multi-limbed robot is capable of realizing various motions, including climbing at a slope angle of 45° with a much shorter planning time.
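The following toy sketch shows the structure of such an ADMM decomposition: two blocks (stand-ins for the contact MIQP and the dynamics NLP) are solved alternately and coupled through a consensus constraint and a scaled dual update. The closed-form quadratic subproblems are placeholders, not the actual MIQP/NLP solvers.

```python
import numpy as np

def solve_contact_block(z, u, rho, a=1.0):
    # Placeholder for the contact/gait MIQP: a proximal step toward target a
    return (a + rho * (z - u)) / (1.0 + rho)

def solve_dynamics_block(x, u, rho, b=3.0):
    # Placeholder for the nonlinear-dynamics NLP: a proximal step toward target b
    return (b + rho * (x + u)) / (1.0 + rho)

def admm(rho=1.0, iters=50):
    # x: contact-block copy, z: dynamics-block copy, u: scaled dual variable
    x = z = u = np.zeros(2)
    for _ in range(iters):
        x = solve_contact_block(z, u, rho)
        z = solve_dynamics_block(x, u, rho)
        u = u + (x - z)            # scaled dual update on the consensus constraint x == z
    return x, z

print(admm())   # both copies converge to a consensus between the two blocks' targets
```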
This paper presents SCALER, a quadrupedal robot that climbs on walls, overhangs, and ceilings, and travels on the ground. SCALER is one of the first quadrupedal robots with enough degrees of freedom to free-climb under Earth's gravity, and one of the most efficient quadrupeds on the ground. Where other state-of-the-art climbers specialize in climbing alone, SCALER promises practical free-climbing with payload capability \textit{and} ground locomotion, realizing truly versatile mobility. The novel SKATE climbing gait increases the payload capacity by exploiting SCALER's body linkage mechanism. SCALER achieves a maximum normalized locomotion speed of 1.87 /s, or 0.56 m/s, on the ground and 1.2 /min, or 0.42 m/min, in rock-wall climbing. The payload capacity reaches 233% of SCALER's weight on the ground and 35% on a vertical wall. Our GOAT gripper, a mechanically adaptable two-finger gripper, successfully grasps convex and non-convex objects and supports SCALER.
We present an admittance controller with auto-tuning that can be applied to single- and multi-point contact robots (e.g., legged robots with point feet or multi-finger grippers). The goal of the controller is to track a wrench profile at each contact point while considering the additional torques induced by rotational friction. Our admittance controller is adaptive during online operation: an auto-tuning method adjusts the controller gains while following several training objectives that promote controller stability, such as tracking the wrench profile as closely as possible, ensuring the control outputs stay within force limits, minimizing slippage, and avoiding kinematic singularities. We demonstrate the robustness of the controller on hardware with a multi-limbed climbing robot for manipulation and locomotion tasks.
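A minimal sketch of a discrete-time admittance update for one contact is given below: the commanded contact-frame motion is adapted so that the measured wrench moves toward the desired wrench profile, and the output is clamped to stay within limits. The gains, the 6-DoF wrench representation, and the clamping are illustrative assumptions; in the described method the gains themselves would be adjusted online by the auto-tuning procedure.

```python
import numpy as np

class AdmittanceController:
    def __init__(self, M, D, dt, vel_limit=0.05):
        self.M_inv = np.linalg.inv(M)    # virtual inertia (6x6)
        self.D = D                       # virtual damping (6x6)
        self.dt = dt
        self.vel_limit = vel_limit
        self.vel = np.zeros(6)           # commanded twist of the contact frame

    def step(self, wrench_measured, wrench_desired):
        wrench_error = wrench_measured - wrench_desired
        acc = self.M_inv @ (wrench_error - self.D @ self.vel)
        self.vel = self.vel + self.dt * acc
        # keep the control output within actuation/safety limits
        self.vel = np.clip(self.vel, -self.vel_limit, self.vel_limit)
        return self.vel * self.dt        # incremental contact-frame displacement

ctrl = AdmittanceController(M=np.eye(6) * 2.0, D=np.eye(6) * 20.0, dt=0.01)
delta_pose = ctrl.step(wrench_measured=np.zeros(6),
                       wrench_desired=np.array([0, 0, -5.0, 0, 0, 0]))
```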
Aggregated data often appear in various fields such as socio-economics and public security. Aggregated data are associated not with points but with supports (e.g., spatial regions in a city). Since the supports may vary depending on the attribute (e.g., poverty rate and crime rate), modeling such data is not straightforward. This paper offers a multi-output Gaussian process (MoGP) model that infers functions for the attributes using multiple aggregated datasets of respective granularities. In the proposed model, the function for each attribute is assumed to be a dependent GP modeled as a linear mixing of independent latent GPs. We design an observation model with an aggregation process for each attribute, where the process is an integral of the GP over the corresponding support. We also introduce a prior distribution over the mixing weights, which allows knowledge transfer across domains (e.g., cities) by sharing the prior. This is advantageous when the spatially aggregated dataset in a city is too coarse to interpolate: the proposed model can still make accurate predictions of the attributes by leveraging aggregated datasets from other cities. Inference in the proposed model is based on variational Bayes, which enables learning the model parameters using aggregated datasets from multiple domains. The experiments demonstrate that the proposed model outperforms competing methods in the task of refining coarse-grained aggregated data on real-world datasets: time series of air pollutants in Beijing and various spatial datasets from New York City and Chicago.
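The generative structure described above can be sketched in a few lines of numpy: shared latent GPs are linearly mixed into attribute functions, and each observation is an aggregate (here a simple average) of an attribute over its support. The kernel, mixing weights, and supports below are toy assumptions, not the paper's learned quantities.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=0.1):
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 100)[:, None]             # fine grid covering the city

# Draw Q shared latent GPs on the grid
Q, jitter = 2, 1e-6
K = rbf_kernel(grid, grid) + jitter * np.eye(len(grid))
L = np.linalg.cholesky(K)
latents = L @ rng.standard_normal((len(grid), Q))   # shape (100, Q)

# Two attributes (e.g., poverty rate, crime rate) as linear mixtures of the latents
W = np.array([[1.0, 0.3],
              [0.2, 0.8]])                           # mixing weights, one row per attribute
attributes = latents @ W.T                           # shape (100, 2)

# Aggregated observations: averages over coarse supports (here, 4 equal regions)
supports = np.array_split(np.arange(len(grid)), 4)
aggregated = np.array([[attributes[idx, a].mean() for idx in supports]
                       for a in range(2)])           # shape (2 attributes, 4 supports)
print(aggregated)
```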
In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma. When a whole slide image (WSI) is used as an input query, it is desirable to retrieve similar cases by focusing on image patches in pathologically important regions such as tumor cells. To address this problem, we employ attention-based multiple instance learning, which enables us to focus on tumor-specific regions when computing the similarity between cases. Moreover, we employ contrastive distance metric learning to incorporate immunohistochemical (IHC) staining patterns as useful supervised information for defining an appropriate similarity between heterogeneous malignant lymphoma cases. In experiments on 249 malignant lymphoma patients, we confirmed that the proposed method exhibited higher evaluation measures than the baseline case-based SIR methods. Furthermore, a subjective evaluation by pathologists revealed that our similarity measure using IHC staining patterns is appropriate for representing the similarity of H&E-stained tissue images of malignant lymphoma.
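To illustrate the two ingredients, attention-based multiple-instance pooling of patch features and a contrastive metric loss supervised by IHC staining patterns, here is a minimal PyTorch sketch. The feature dimensions, the attention form, and the margin are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionMILEncoder(nn.Module):
    def __init__(self, feat_dim=512, embed_dim=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.proj = nn.Linear(feat_dim, embed_dim)

    def forward(self, patches):                        # patches: (num_patches, feat_dim)
        a = torch.softmax(self.attn(patches), dim=0)   # attention over the patches of one case
        case_feat = (a * patches).sum(dim=0)           # weighted sum -> case-level feature
        return F.normalize(self.proj(case_feat), dim=0)

def contrastive_loss(emb1, emb2, same_ihc_pattern, margin=1.0):
    # Pull cases with the same IHC staining pattern together, push others apart
    d = torch.norm(emb1 - emb2)
    if same_ihc_pattern:
        return d ** 2
    return torch.clamp(margin - d, min=0.0) ** 2

encoder = AttentionMILEncoder()
case_a = encoder(torch.randn(1000, 512))   # patch features extracted from one WSI
case_b = encoder(torch.randn(800, 512))
loss = contrastive_loss(case_a, case_b, same_ihc_pattern=False)
```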
Agents that can follow language instructions are expected to be useful in a variety of situations such as navigation. However, training neural network-based agents requires numerous paired trajectories and languages. This paper proposes using multimodal generative models for semi-supervised learning in instruction-following tasks. The models learn a shared representation of the paired data, and enable semi-supervised learning by reconstructing unpaired data through the representation. Key challenges in applying the models to sequence-to-sequence tasks, including instruction following, are learning a shared representation of variable-length multimodal data and incorporating attention mechanisms. To address these problems, this paper proposes a novel network architecture to absorb the difference in the sequence lengths of the multimodal data. In addition, to further improve performance, this paper shows how to combine the generative model-based approach with an existing semi-supervised method called a speaker-follower model, and proposes a regularization term that improves inference using unpaired trajectories. Experiments on the BabyAI and Room-to-Room (R2R) environments show that the proposed method improves the performance of instruction following by leveraging unpaired data, and improves the performance of the speaker-follower model by 2\% to 4\% in R2R.
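A conceptual sketch of the shared-representation idea follows: modality-specific encoders map variable-length token sequences into a common latent space and decoders reconstruct from it, so unpaired data can be trained by reconstructing itself through that latent. The GRU-based architecture, the dimensions, and the simple unpaired objective below are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    def __init__(self, vocab, latent_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, 64)
        self.gru = nn.GRU(64, 128, batch_first=True)
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)

    def forward(self, tokens):                   # tokens: (batch, seq_len), any length
        _, h = self.gru(self.emb(tokens))
        h = h.squeeze(0)
        return self.mu(h), self.logvar(h)

class ModalityDecoder(nn.Module):
    def __init__(self, vocab, latent_dim=64, max_len=20):
        super().__init__()
        self.max_len = max_len
        self.fc = nn.Linear(latent_dim, 128)
        self.gru = nn.GRU(128, 128, batch_first=True)
        self.out = nn.Linear(128, vocab)

    def forward(self, z):
        h = self.fc(z).unsqueeze(1).repeat(1, self.max_len, 1)
        o, _ = self.gru(h)
        return self.out(o)                       # (batch, max_len, vocab) logits

def reparameterize(mu, logvar):
    return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

# Unpaired example: reconstruct a trajectory through the shared latent only
traj_enc, traj_dec = ModalityEncoder(vocab=50), ModalityDecoder(vocab=50)
traj = torch.randint(0, 50, (4, 15))
mu, logvar = traj_enc(traj)
logits = traj_dec(reparameterize(mu, logvar))
recon = nn.functional.cross_entropy(logits[:, :15].reshape(-1, 50), traj.reshape(-1))
kl = -0.5 * torch.mean(1 + logvar - mu ** 2 - logvar.exp())
loss = recon + kl
```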
A polarization camera has great potential for 3D reconstruction since the angle of polarization (AoP) and the degree of polarization (DoP) of reflected light are related to an object's surface normal. In this paper, we propose a novel 3D reconstruction method called Polarimetric Multi-View Inverse Rendering (Polarimetric MVIR) that effectively exploits geometric, photometric, and polarimetric cues extracted from input multi-view color-polarization images. We first estimate camera poses and an initial 3D model by geometric reconstruction with a standard structure-from-motion and multi-view stereo pipeline. We then refine the initial model by optimizing photometric rendering errors and polarimetric errors using multi-view RGB, AoP, and DoP images, where we propose a novel polarimetric cost function that enables an effective constraint on the estimated surface normal of each vertex, while considering four possible ambiguous azimuth angles revealed from the AoP measurement. The weight for the polarimetric cost is effectively determined based on the DoP measurement, which is regarded as the reliability of polarimetric information. Experimental results using both synthetic and real data demonstrate that our Polarimetric MVIR can reconstruct a detailed 3D shape without assuming a specific surface material and lighting condition.
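As a rough illustration of a polarimetric cost of this kind, the sketch below penalizes the discrepancy between the azimuth of an estimated surface normal and the closest of four azimuth candidates implied by the AoP measurement, weighted by the DoP as a reliability measure. The candidate set and the distance function are assumptions for illustration, not the paper's exact cost.

```python
import numpy as np

def azimuth_of_normal(normal_cam):
    """Azimuth of a surface normal expressed in the camera frame."""
    return np.arctan2(normal_cam[1], normal_cam[0])

def angular_diff(a, b):
    d = np.abs(a - b) % (2 * np.pi)
    return np.minimum(d, 2 * np.pi - d)

def polarimetric_cost(normal_cam, aop, dop):
    # Four ambiguous azimuth candidates derived from the AoP measurement
    candidates = np.array([aop, aop + np.pi, aop + np.pi / 2, aop - np.pi / 2])
    phi = azimuth_of_normal(normal_cam)
    # DoP-weighted squared distance to the closest candidate
    return dop * np.min(angular_diff(phi, candidates) ** 2)

cost = polarimetric_cost(normal_cam=np.array([0.3, 0.1, 0.95]),
                         aop=np.deg2rad(20.0), dop=0.4)
```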
When simulating soft robots, both their morphology and their controllers play important roles in task performance. This paper introduces a new method to co-evolve these two components in the same process. We do so by using the hyperNEAT algorithm to generate two separate neural networks in one pass, one responsible for the design of the robot body structure and the other for the control of the robot. The key difference between our method and most existing approaches is that it does not treat the development of the morphology and the controller as separate processes. Similar to nature, our method derives both the "brain" and the "body" of an agent from a single genome and develops them together. While our approach is more realistic and does not require an arbitrary separation of processes during evolution, it also makes the problem more complex because the search space for this single genome becomes larger and any mutation to the genome affects the "brain" and the "body" at the same time. Additionally, we present a new speciation function that takes into consideration both the genotypic distance, as is standard for NEAT, and the similarity between robot bodies. By using this function, agents with very different bodies are more likely to be placed in different species; this allows robots with different morphologies to have more specialized controllers, since they will not cross over with other robots that are too different from them. We evaluate the presented methods on four tasks and observe that, even though the search space is larger, having a single genome makes the evolution process converge faster than having separate genomes for body and control. The agents in our population also show morphologies with a high degree of regularity and controllers capable of coordinating the voxels to produce the necessary movements.
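A minimal sketch of such a combined speciation distance is shown below: a toy NEAT-style genotypic distance is blended with a measure of how different two voxel bodies are, so that agents with very different bodies tend to fall into different species. The body representation (a voxel occupancy grid) and the blend weights are illustrative assumptions.

```python
import numpy as np

def neat_genotypic_distance(genome_a, genome_b, c_disjoint=1.0, c_weight=0.4):
    """Toy stand-in for NEAT compatibility distance over {innovation: weight} dicts."""
    shared = set(genome_a) & set(genome_b)
    disjoint = len(set(genome_a) ^ set(genome_b))
    n = max(len(genome_a), len(genome_b), 1)
    w_diff = np.mean([abs(genome_a[i] - genome_b[i]) for i in shared]) if shared else 0.0
    return c_disjoint * disjoint / n + c_weight * w_diff

def body_distance(voxels_a, voxels_b):
    """Fraction of voxels whose occupancy/material differs between two bodies."""
    return np.mean(voxels_a != voxels_b)

def speciation_distance(genome_a, genome_b, voxels_a, voxels_b, alpha=0.5):
    # Agents with very different bodies end up far apart even if their genomes are close
    return (1 - alpha) * neat_genotypic_distance(genome_a, genome_b) \
         + alpha * body_distance(voxels_a, voxels_b)

ga = {1: 0.5, 2: -0.3, 4: 1.1}
gb = {1: 0.4, 3: 0.2, 4: 0.9}
va = np.random.randint(0, 2, (4, 4, 4))
vb = np.random.randint(0, 2, (4, 4, 4))
print(speciation_distance(ga, gb, va, vb))
```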